PGCon2019 - 3.4
PGCon 2019
The PostgreSQL Conference
Speakers | |
---|---|
Min Wei |
Schedule | |
---|---|
Day | Talks - Day 1 - 2019-05-30 |
Room | DMS 1120 |
Start time | 15:00 |
Duration | 00:45 |
Info | |
---|---|
ID | 1377 |
Event type | Meeting |
Track | Scaling Out |
Language used for presentation | English |
VeniceDB
a petabyte-scale, real-time analytics service running Postgres on Azure
VeniceDB is a large-scale OLAP service that runs a custom build of Postgres/Citus on Azure. Since the public talk at PostgresSV2018, the cluster has grown to 1 PB to support the Microsoft CoreOS executive decision dashboard, which hosts all measures covering everything from edge devices to the cloud OS. VeniceDB has been in production for about a year and has gone through a few major revisions while the service kept running. During each revision, we tuned the data model and indexing to improve data ingestion and query performance. We also migrated from Postgres 10 to Postgres 11, which helped reduce cluster cost by 30%.
This talk covers why we chose Postgres and Citus as the foundation and how we built unified storage to serve a variety of measure data needs. Along the way, we replaced not only traditional MapReduce-based cubing jobs but also a columnar storage cluster. We will share our perspectives on row storage versus column storage, how VeniceDB compares with other OLAP solutions such as Druid, and our wish list for future Postgres improvements.